The World Wide Web by Email
^^^^^^^^^^^^^^^^^^^^^^^^^^^
by Odd de Presno

Example:

   www http://www.biotech.washington.edu/WebCrawler/WebCrawlerExamples.html

That's all. Lean back and wait. You will get a page filled with hints on
how to use the WebCrawler service. The mail will look like this:

.--------------------- Example 5 ----------------------------------------
|Date: Mon, 15 Aug 1994 18:10:44 +0200
|From: daemon@www0.cern.ch (The CERN WWW Team Administration)
|Subject: Hints for Searching the WebCrawler Index (was: )
|
|This is a test version. Please mail any comments to www-request@info.cern.ch
|
|The document you requested, which URL is
|http://www.biotech.washington.edu/WebCrawler/WebCrawlerExamples.html,
|follows
|
|
|               Hints for Searching the WebCrawler Index
|   The WebCrawler knows about a lot of documents, so it pays to make precise
|   queries. Often, though, you can be too precise, so finding what you want
|   may take a couple of queries. Here are some suggestions about what to do
|   when you don't get what you want, some examples to help you out, and
|   detailed explanation of what happens to your query before it's run.
|
|   WHAT TO DO WHEN...
|
|   Your search produces no results. Check your spelling! If that looks OK,
|   then try to be less specific in your query. For instance, the query
|   molecular biotechnology DNA sequencing genetics chromosome human genome
|   project is too specific -- no one document contains all of those keywords.
|   Something like molecular biotechnology DNA sequencing is more appropriate.
|
|   Your search produces too many results. Be more specific, and make sure you
|   have the AND button checked. Try to think of words that uniquely identify
|   what you're looking for. Some words are of little value, because they
|   identify lots of documents in the WebCrawler's index.
|   For instance, the words information and university together identify
|   nearly half the documents in the index, so they're not very useful in
|   trying to narrow down the search.
|
|   You get an error from the WebCrawler. The WebCrawler will return an
|   unfriendly error message if it's too busy, or if it chokes on your query.
|   If it repeatedly has trouble with your query, please let me know, as I'm
|   trying to eliminate these problems. Thanks!
|
|Examples
|
|   Most specific queries work quite well. For instance, if you're looking
|   for information on the music group They Might Be Giants, search for They
|   Might Be Giants, or just TMBG.
|
|   Some keywords are found in many places. For example, instead of
|   searching for kermit, use something more descriptive like kermit
|   columbia or kermit source code communication. Make sure the
|   "AND" button is checked.
|
|   To find references to the New York Times, try the query New York Times.
|   To be more specific, try something like New York Times online newspaper.
|
|How a query works
|
|   The query is parsed in to keywords on space and punctuation boundaries.
|
|   Each word is folded to lower case, and any endings are stripped (NeXT
|   Computers becomes next computer).
|
|   Each word is checked against a stop list, to see if it's too common to
|   worry about (to be or not to be is a null query!).
|
|   Each word is fed to the index, and the resulting lists of documents are
|   combined.
|
|   bp@cs.washington.edu[1]
|
|*** References from this document ***
|[1] http://www.cs.washington.edu/homes/bp/bp.html
|
.------------------------------------------------------------------------

The last line of the report is interesting. The "[1]" refers to the
following entry in the page's text:

   bp@cs.washington.edu[1]

Interactive WWW users can click on this reference to see the associated
page. Those using email must send the URL at the bottom of the report
back to the LISTPROC to get it.
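
For readers who save these reports to disk, the follow-up step can be
automated. What follows is only a minimal Python sketch, assuming the
report has the layout shown in Example 5: a line reading
"*** References from this document ***" followed by lines of the form
"[n] URL". The function names extract_references and follow_up_requests
are hypothetical and of my own choosing; only the "www URL" command form
is taken from the example above.

```python
import re

# Marker line that introduces the numbered URL list in a report
# (copied from the layout shown in Example 5).
MARKER = "*** References from this document ***"


def extract_references(report):
    """Return a dict {number: URL} built from the reference list of a report.

    Hypothetical helper: assumes each reference sits on its own line
    as "[n] URL" somewhere after the marker line.
    """
    _, found, tail = report.partition(MARKER)
    if not found:
        return {}
    refs = {}
    for line in tail.splitlines():
        # Tolerate a leading '|' in case the text was copied from a boxed example.
        match = re.match(r"^\s*\|?\s*\[(\d+)\]\s+(\S+)", line)
        if match:
            refs[int(match.group(1))] = match.group(2)
    return refs


def follow_up_requests(refs):
    """Build one 'www URL' request line per reference, in numeric order."""
    return ["www " + url for _, url in sorted(refs.items())]


if __name__ == "__main__":
    sample = (
        "   bp@cs.washington.edu[1]\n"
        "\n"
        "*** References from this document ***\n"
        "[1] http://www.cs.washington.edu/homes/bp/bp.html\n"
    )
    for request in follow_up_requests(extract_references(sample)):
        print(request)
```

Each line printed is a request you could paste into the text of a new
mail to the server. The sketch deliberately does not send any mail
itself; how you submit the request depends on your own mail setup.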

Actually, there is also a WWWmail command called "deep" that returns the
requested document together with the documents it refers to. If you
replace the "www" command above with

   deep http://www.biotech.washington.edu/WebCrawler/WebCrawlerExamples.html

you will get both the "Hints" page and the one giving more information
about bp@cs.washington.edu .

Note: If the requested document is too large, you'll only get the first
5,000 lines.

There may be many such reference pointers in the text, as illustrated by
this page at URL:

   http://web2.xerox.com/digitrad

.--------------------- Example 6 --------------------------------------
|Date: Mon, 15 Aug 1994 14:03:10 +0200
|From: daemon@www0.cern.ch (The CERN WWW Team Administration)
|Subject: Digital Tradition Folk Song Full Text Search (was: )
|
|This is a test version. Please mail any comments to www-request@info.cern.ch
|
|The document you requested, which URL is
|http://web2.xerox.com/digitrad, follows
|
|
|             Digital Tradition Folk Song Full Text Search
|                 DIGITAL TRADITION FOLK SONG DATABASE
|
|   This is a searchable index of the Digital Tradition Folk Song Database
|   (April 1994 version). Please read About The Digital Tradition[1] and
|   Searching Digital Tradition[2].
|
|Full Text Search
|
|   You may enter a Search Pattern to select songs from the database.
|
|   Options: search titles[3] or search full text; show matching text or list
|   titles only[4]; list first 50 or list more (100)[5]; default settings.
|
|Contents
|
|   Keywords List[6]
|
|   Titles List[7]
|
|   Tunes List[8]
|
|   (DT of April 1994)
|
|*** References from this document ***
|[1] http://web2.xerox.com/docs/DigiTrad/AboutDigiTrad.html
|[2] http://web2.xerox.com/docs/DigiTrad/DigiTradSearch.html
|[3] http://web2.xerox.com/digitrad/titles
|[4] http://web2.xerox.com/digitrad/short
|[5] http://web2.xerox.com/digitrad/list=100
|[6] http://web2.xerox.com/docs/DigiTrad/DigiTradKeywords.html
|[7] http://web2.xerox.com/docs/DigiTrad/DigiTradTitles.html
|[8] http://web2.xerox.com/docs/DigiTrad/DigiTradTunes.html
.-----------------------------------------------------------------------

For more information about this WWW by mail service, send the word
"help" to listproc@www0.cern.ch .

Experiences
===========
I have tried to send search requests to the Lycos database search page,
but so far without luck. If you find a way of doing this, please share.
It would make searching so much more productive for me.

--- end ---

The Online World Monitor newsletter
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
is a bi-monthly, ASCII online product. Initially meant as a free,
optional offering for supporters of The Online World resources handbook,
it is also open for subscription by others.

The newsletter and the book are companions. While the book describes the
online world as it is, the newsletter tracks changes. It can focus more
freely on selected offerings or phenomena than the strict framework of
the book allows.

For more about the newsletter, send email to LISTSERV@VM1.NODAK.EDU with
the following command in the TEXT of your mail:

   GET TOW MONITOR

Add the following command for information about The Online World
resources handbook:

   GET TOW INDEX

Information is also available by gopher at cosn.org. Select
Networking Information/Reference/The Online World.

Thanks,
Odd de Presno